import SetupEnv from './common/setup-env.mdx';

# MCP server

Midscene provides a MCP server that allows AI assistants to control Android devices, automate mobile app testing tasks.

:::info MCP Introduction
MCP ([Model Context Protocol](https://modelcontextprotocol.io/introduction)) is a standardized way for AI models to interact with external tools and capabilities. MCP servers expose a set of tools that AI models can invoke to perform various tasks. For Midscene, these tools allow AI models to connect to Android devices, launch apps, interact with UI elements, and more.
:::

## Use cases

- Execute automated testing on Android devices
- Control Android apps for UI interaction

## Setting up Midscene MCP

### Prerequisites

1. An OpenAI API key or another supported AI model provider. For more information, see [Choosing an AI Model](./choose-a-model).
2. [Android adb](https://developer.android.com/tools/adb?hl=zh-cn) tool installed and configured
3. Android device with USB debugging enabled and connected to your computer

### Configuration

Add the Midscene MCP server to your MCP configuration, note that the `MIDSCENE_MCP_ANDROID_MODE` environment variable is required:

```json
{
  "mcpServers": {
    "mcp-midscene": {
      "command": "npx",
      "args": ["-y", "@midscene/mcp"],
      "env": {
        "MIDSCENE_MODEL_NAME": "REPLACE_WITH_YOUR_MODEL_NAME",
        "OPENAI_API_KEY": "REPLACE_WITH_YOUR_OPENAI_API_KEY",
        "MIDSCENE_MCP_ANDROID_MODE": "true",
        "MCP_SERVER_REQUEST_TIMEOUT": "800000"
      }
    }
  }
}
```

For more information about configuring AI models, see [Choosing an AI Model](./choose-a-model).

## Available tools

Midscene MCP provides the following Android device automation tools:

| Category                         | Tool Name                     | Description                                       |
| -------------------------------- | ----------------------------- | ------------------------------------------------- |
| **Device Management**            | midscene_android_list_devices | List all connected Android devices                |
|                                  | midscene_android_connect      | Connect to a specific Android device              |
| **App Control**                  | midscene_android_launch       | Launch an app or open a webpage on Android device |
| **System Operations**            | midscene_android_back         | Press the back button on Android device           |
|                                  | midscene_android_home         | Press the home button on Android device           |
| **Page Interaction**             | midscene_aiTap                | Click on an element described in natural language |
|                                  | midscene_aiInput              | Input text into a form field or element           |
|                                  | midscene_aiKeyboardPress      | Press a specific keyboard key                     |
|                                  | midscene_aiScroll             | Scroll the page or a specific element             |
| **Verification and Observation** | midscene_aiWaitFor            | Wait for a condition to be true on the page       |
|                                  | midscene_aiAssert             | Assert that a condition is true on the page       |
|                                  | midscene_screenshot           | Take a screenshot of the current page             |

### Device management

- **midscene_android_list_devices**: List all connected Android devices available for automation

  ```
  Parameters: None
  ```

- **midscene_android_connect**: Connect to an Android device via ADB
  ```
  Parameters:
  - deviceId: (Optional) Device ID to connect to. If not provided, uses the first available device.
  - displayId: (Optional) Display ID for multi-display Android devices (e.g., 0, 1, 2). When specified, all ADB input operations will target this specific display.
  ```

### App control

- **midscene_android_launch**: Launch an app or navigate to a URL on Android device
  ```
  Parameters:
  - uri: Package name, activity name, or URL to launch
  ```

### System operations

- **midscene_android_back**: Press the back button on Android device

  ```
  Parameters: None
  ```

- **midscene_android_home**: Press the home button on Android device
  ```
  Parameters: None
  ```

### Page interaction

- **midscene_aiTap**: Click on an element described in natural language

  ```
  Parameters:
  - locate: Natural language description of the element to click
  ```

- **midscene_aiInput**: Input text into a form field or element

  ```
  Parameters:
  - value: The text to input
  - locate: Natural language description of the element to input text into
  ```

- **midscene_aiKeyboardPress**: Press a specific keyboard key

  ```
  Parameters:
  - key: The key to press (e.g., 'Enter', 'Tab', 'Escape')
  - locate: (Optional) Description of element to focus before pressing the key
  - deepThink: (Optional) If true, uses more precise element location
  ```

- **midscene_aiScroll**: Scroll the page or a specific element
  ```
  Parameters:
  - direction: 'up', 'down', 'left', or 'right'
  - scrollType: 'once', 'untilBottom', 'untilTop', 'untilLeft', or 'untilRight'
  - distance: (Optional) Distance to scroll in pixels
  - locate: (Optional) Description of the element to scroll
  - deepThink: (Optional) If true, uses more precise element location
  ```

### Verification and observation

- **midscene_aiWaitFor**: Wait for a condition to be true on the page

  ```
  Parameters:
  - assertion: Natural language description of the condition to wait for
  - timeoutMs: (Optional) Maximum time to wait in milliseconds
  - checkIntervalMs: (Optional) How often to check the condition
  ```

- **midscene_aiAssert**: Assert that a condition is true on the page

  ```
  Parameters:
  - assertion: Natural language description of the condition to check
  ```

- **midscene_screenshot**: Take a screenshot of the current page
  ```
  Parameters:
  - name: Name for the screenshot
  ```

## Common issues

### How to connect an Android device?

1. Ensure Android SDK is installed and ADB is configured
2. Enable Developer Options and USB debugging on your Android device
3. Connect the device to your computer via USB cable
4. Run `adb devices` to confirm the device is connected
5. Use `midscene_android_list_devices` in MCP to view available devices

### How to launch an Android app?

Use the `midscene_android_launch` tool with parameters that can be:

- App package name: e.g., `com.android.chrome`
- Activity name: e.g., `com.android.chrome/.MainActivity`
- Web URL: e.g., `https://www.example.com`

### Local port conflicts when multiple clients are used

> Problem description

When users simultaneously use Midscene MCP in multiple local clients (Claude Desktop, Cursor MCP, etc.), port conflicts may occur causing server errors

> Solution

- Temporarily close the MCP server in the extra clients
- Execute the command:

```bash
# For macOS/Linux:
lsof -i:3766 | awk 'NR>1 {print $2}' | xargs -r kill -9

# For Windows:
FOR /F "tokens=5" %i IN ('netstat -ano ^| findstr :3766') DO taskkill /F /PID %i
```

### How to access Midscene execution reports

After each task execution, a Midscene task report is generated. You can open this HTML report directly from the command line:

```bash
# Replace the opened address with your report filename
open report_file_name.html
```

![image](https://lf3-static.bytednsdoc.com/obj/eden-cn/ozpmyhn_lm_hymuPild/ljhwZthlaukjlkulzlp/midscene/image.png)
